NOVEL POLYPEPTIDES AND NUCLEIC ACIDS ENCODING THE SAME

FIELD OF THE INVENTION 
The present invention relates generally to the identification and isolation of novel DNA and to the 
recombinant production of novel polypeptides. 

BACKGROUND OF THE INVENTION 
Extracellular proteins play important roles in, among other things, the formation, differentiation and 
maintenance of multicellular organisms. The fate of many individual cells, e.g., proliferation, migration, 
differentiation, or interaction with other cells, is typically governed by information received from other cells
and/or the immediate environment. This information is often transmitted by secreted polypeptides (for instance,
mitogenic factors, survival factors, cytotoxic factors, differentiation factors, neuropeptides, and hormones) which
are, in turn, received and interpreted by diverse cell receptors or membrane-bound proteins. These secreted 
polypeptides or signaling molecules normally pass through the cellular secretory pathway to reach their site of 
action in the extracellular environment. 

Secreted proteins have various industrial applications, including as pharmaceuticals, diagnostics,
biosensors and bioreactors. Most protein drugs available at present, such as thrombolytic agents, interferons,
interleukins, erythropoietins, colony stimulating factors, and various other cytokines, are secretory proteins.
Their receptors, which are membrane proteins, also have potential as therapeutic or diagnostic agents. Efforts
are being undertaken by both industry and academia to identify new, native secreted proteins. Many efforts are
focused on the screening of mammalian recombinant DNA libraries to identify the coding sequences for novel
secreted proteins. Examples of screening methods and techniques are described in the literature [see, for
example, Klein et al., Proc. Natl. Acad. Sci. 93:7108-7113 (1996); U.S. Patent No. 5,536,637].

Membrane-bound proteins and receptors can play important roles in, among other things, the formation,
differentiation and maintenance of multicellular organisms. The fate of many individual cells, e.g., proliferation,
migration, differentiation, or interaction with other cells, is typically governed by information received from
other cells and/or the immediate environment. This information is often transmitted by secreted polypeptides
(for instance, mitogenic factors, survival factors, cytotoxic factors, differentiation factors, neuropeptides, and
hormones) which are, in turn, received and interpreted by diverse cell receptors or membrane-bound proteins.
Such membrane-bound proteins and cell receptors include, but are not limited to, cytokine receptors, receptor
kinases, receptor phosphatases, receptors involved in cell-cell interactions, and cellular adhesin molecules like
selectins and integrins. For instance, transduction of signals that regulate cell growth and differentiation is
regulated in part by phosphorylation of various cellular proteins. Protein tyrosine kinases, enzymes that catalyze
that process, can also act as growth factor receptors. Examples include fibroblast growth factor receptor and
nerve growth factor receptor.

Membrane-bound proteins and receptor molecules have various industrial applications, including as
pharmaceutical and diagnostic agents. Receptor immunoadhesins, for instance, can be employed as therapeutic 
agents to block receptor-ligand interactions. The membrane-bound proteins can also be employed for screening 
of potential peptide or small molecule inhibitors of the relevant receptor/ligand interaction. 

Efforts are being undertaken by both industry and academia to identify new, native receptor or
membrane-bound proteins. Many efforts are focused on the screening of mammalian recombinant DNA libraries
to identify the coding sequences for novel receptor or membrane-bound proteins.

1. PRO281

A novel gene designated testis enhanced gene transcript (TEGT) has recently been identified in humans
(Walter et al., Genomics 20:301-304 (1995)). Recent results have shown that TEGT protein is developmentally
regulated in the mammalian testis and possesses a nuclear targeting motif that allows the protein to localize to
the nucleus (Walter et al., Mamm. Genome 5:216-221 (1994)). As such, it is believed that the TEGT protein
plays an important role in testis development. There is, therefore, substantial interest in identifying and
characterizing novel polypeptides having homology to the TEGT protein. We herein describe the identification
and characterization of novel polypeptides having homology to TEGT protein, designated herein as PRO281
polypeptides.

2. PRO276

Efforts are being undertaken by both industry and academia to identify new, native membrane-bound
proteins. Many of these efforts are focused on the screening of mammalian recombinant DNA libraries to
identify the coding sequences for novel membrane-bound proteins. We herein describe the identification and
characterization of novel transmembrane polypeptides, designated herein as PRO276 polypeptides.

3. PRO189

Efforts are being undertaken by both industry and academia to identify new, native secreted proteins.
Many of these efforts are focused on the screening of mammalian recombinant DNA libraries to identify the
coding sequences for novel secreted proteins. We herein describe the identification and characterization of novel
secreted polypeptides, designated herein as PRO189 polypeptides.

4. PRO190

Of particular interest are proteins having seven transmembrane domains (7TM), or more generally, all
multiple transmembrane spanning proteins. Among multiple transmembrane spanning proteins are ion channels
and transporters. Examples of transporters are the UDP-galactose transporter described in Ishida, et al., J.
Biochem., 120(6):1074-1078 (1996), and the CMP-sialic acid transporter described in Eckhardt, et al., PNAS,
93(15):7572-7576 (1996). We herein describe the identification and characterization of novel transmembrane
polypeptides, designated herein as PRO190 polypeptides.
Haemost. (Germany), 74(1):111-116 (July 1995), reporting that platelets have leucine rich repeats and Ruoslahti,
E. I., et al., WO9110727-A by La Jolla Cancer Research Foundation reporting that decorin binding to
transforming growth factor β has involvement in a treatment for cancer, wound healing and scarring. Related by
function to this group of proteins is the insulin like growth factor (IGF), in that it is useful in wound-healing and
associated therapies concerned with re-growth of tissue, such as connective tissue, skin and bone; in promoting
body growth in humans and animals; and in stimulating other growth-related processes. The acid labile subunit
of IGF (ALS) is also of interest in that it increases the half-life of IGF and is part of the IGF complex in vivo.

Another protein which has been reported to have leucine-rich repeats is the SLIT protein which has been
reported to be useful in treating neuro-degenerative diseases such as Alzheimer's disease, nerve damage such
as in Parkinson's disease, and for diagnosis of cancer, see, Artavanis-Tsakonas, S. and Rothberg, J. M.,
WO9210518-A1 by Yale University. Of particular interest is LIG-1, a membrane glycoprotein that is expressed
specifically in glial cells in the mouse brain, and has leucine rich repeats and immunoglobulin-like domains.
Suzuki, et al., J. Biol. Chem. (U.S.), 271(37):22522 (1996). Other studies reporting on the biological functions
of proteins having leucine rich repeats include: Tayar, N., et al., Mol. Cell. Endocrinol. (Ireland), 125(1-2):65-70
(Dec. 1996) (gonadotropin receptor involvement); Miura, Y., et al., Nippon Rinsho (Japan), 54(7):1784-1789
(July 1996) (apoptosis involvement); Harris, P. C., et al., J. Am. Soc. Nephrol., 6(4):1125-1133 (Oct. 1995)
(kidney disease involvement).

We herein describe the identification and characterization of novel polypeptides having homology to 
LIG, designated herein as PRO1111 polypeptides.

65. PRO1344

Factor C is a protein that is intimately involved with the coagulation cascade in a variety of organisms.
The coagulation cascade has been shown to involve numerous different intermediate proteins, including factor
C, all of whose activities are essential to the proper functioning of this cascade. Abnormal coagulation cascade
function can result in a variety of serious abnormalities and, as such, the activities of the coagulation cascade
proteins are of particular interest. Efforts are therefore currently being undertaken to identify novel polypeptides
having homology to one or more of the coagulation cascade proteins.

We herein describe the identification and characterization of novel polypeptides having homology to 
factor C protein, designated herein as PRO1344 polypeptides.

66. PRO1109

Carbohydrate chains on glycoproteins are important not only for protein conformation, transport and
stability, but also for cell-cell and cell-matrix interactions. β-1,4-galactosyltransferase is an enzyme that is
involved in producing carbohydrate chains on proteins, wherein the β-1,4-galactosyltransferase enzyme acts to
transfer galactose to the terminal N-acetylglucosamine of complex-type N-glycans in the Golgi apparatus (Asano
et al., EMBO J. 16:1850-1857 (1997)). In addition, it has been suggested that β-1,4-galactosyltransferase is
involved directly in cell-cell interactions during fertilization and early embryogenesis through a subpopulation
of this enzyme distributed on the cell surface. Specifically, Lu et al., Development 124:4121-4131 (1997) and
the test DNA molecule under conditions suitable for expression of the polypeptide, and (iii) recovering the
polypeptide from the cell culture. 

In yet another embodiment, the invention concerns agonists and antagonists of a native PRO1111
polypeptide. In a particular embodiment, the agonist or antagonist is an anti-PRO1111 antibody.

In a further embodiment, the invention concerns a method of identifying agonists or antagonists of a
native PRO1111 polypeptide, by contacting the native PRO1111 polypeptide with a candidate molecule and
monitoring a biological activity mediated by said polypeptide.

In a still further embodiment, the invention concerns a composition comprising a PRO1111 polypeptide,
or an agonist or antagonist as hereinabove defined, in combination with a pharmaceutically acceptable carrier.

65. PRO1344

A cDNA clone (DNA58723-1588) has been identified, having homology to nucleic acid encoding factor
C that encodes a novel polypeptide, designated in the present application as "PRO1344".

In one embodiment, the invention provides an isolated nucleic acid molecule comprising DNA encoding
a PRO1344 polypeptide.

In one aspect, the isolated nucleic acid comprises DNA having at least about 80% sequence identity,
preferably at least about 85% sequence identity, more preferably at least about 90% sequence identity, most
preferably at least about 95% sequence identity to (a) a DNA molecule encoding a PRO1344 polypeptide having
the sequence of amino acid residues from about 1 or about 24 to about 720, inclusive of Figure 159 (SEQ ID
NO:231), or (b) the complement of the DNA molecule of (a).
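
The tiered identity thresholds recited above (about 80%, 85%, 90% and 95%) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical positions over the residues of the reference sequence. Both function names and this simplified definition of identity are illustrative assumptions; they are not the alignment procedure relied on for the claimed sequences.

```python
from typing import Optional


def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent of reference residues matched identically in a pre-computed alignment.

    Assumption: both strings come from the same pairwise alignment, so they
    have equal length and use '-' for gap positions.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    ref_residues = sum(1 for r in aligned_ref if r != "-")
    if ref_residues == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1 for r, t in zip(aligned_ref, aligned_test) if r != "-" and r == t
    )
    return 100.0 * matches / ref_residues


def highest_threshold_met(pct: float) -> Optional[int]:
    """Return the highest recited threshold (95, 90, 85 or 80) that pct reaches."""
    for threshold in (95, 90, 85, 80):
        if pct >= threshold:
            return threshold
    return None  # below about 80% identity


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any figure of the specification.
    pct = percent_identity("ATGCATGCAT", "ATGCATGCAA")
    print(pct, highest_threshold_met(pct))
```

The word "about" in the recited thresholds is not modelled here; the sketch applies each cutoff as a hard numerical boundary.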

In another aspect, the invention concerns an isolated nucleic acid molecule encoding a PRO1344
polypeptide comprising DNA hybridizing to the complement of the nucleic acid between about nucleotides 26
or about 95 and about 2185, inclusive, of Figure 158 (SEQ ID NO:230). Preferably, hybridization occurs under
stringent hybridization and wash conditions.
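
The nucleotide and residue coordinates quoted for PRO1344 are related by simple codon arithmetic. As a rough consistency check, the sketch below maps 1-based residue numbers to 1-based nucleotide positions. It assumes (this is an assumption, not something stated above) that the coding region is read in a single uninterrupted frame starting at about nucleotide 26, with the stop codon lying outside the quoted range. The function names are hypothetical.

```python
def codon_start(cds_start: int, residue: int) -> int:
    """1-based nucleotide position of the first base of the codon for a 1-based residue."""
    if residue < 1:
        raise ValueError("residue numbers are 1-based")
    return cds_start + 3 * (residue - 1)


def residue_count(cds_start: int, cds_end: int) -> int:
    """Number of whole codons between two inclusive 1-based nucleotide positions."""
    span = cds_end - cds_start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("range does not cover a whole number of codons")
    return span // 3


if __name__ == "__main__":
    # Coordinates recited for PRO1344: nucleotides 26 or 95 to 2185,
    # residues 1 or 24 to 720.
    print(codon_start(26, 24))      # first base of the codon for residue 24
    print(residue_count(26, 2185))  # codons spanned by nucleotides 26..2185
```

Under these assumptions, the codon for residue 24 begins at nucleotide 95 and nucleotides 26 to 2185 span 720 codons, which agrees with the ranges recited above.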

In a further aspect, the invention concerns an isolated nucleic acid molecule comprising DNA having
at least about 80% sequence identity, preferably at least about 85% sequence identity, more preferably at least
about 90% sequence identity, most preferably at least about 95% sequence identity to (a) a DNA molecule
encoding the same mature polypeptide encoded by the human protein cDNA in ATCC Deposit No. 203133
(DNA58723-1588) or (b) the complement of the nucleic acid molecule of (a). In a preferred embodiment, the
nucleic acid comprises a DNA encoding the same mature polypeptide encoded by the human protein cDNA in
ATCC Deposit No. 203133 (DNA58723-1588).

In still a further aspect, the invention concerns an isolated nucleic acid molecule comprising (a) DNA
encoding a polypeptide having at least about 80% sequence identity, preferably at least about 85% sequence
identity, more preferably at least about 90% sequence identity, most preferably at least about 95% sequence
identity to the sequence of amino acid residues 1 or about 24 to about 720, inclusive of Figure 159 (SEQ ID
NO:231), or (b) the complement of the DNA of (a).

In a further aspect, the invention concerns an isolated nucleic acid molecule having at least 10
nucleotides and produced by hybridizing a test DNA molecule under stringent conditions with (a) a DNA
molecule encoding a PRO1344 polypeptide having the sequence of amino acid residues from 1 or about 24 to
about 720, inclusive of Figure 159 (SEQ ID NO:231), or (b) the complement of the DNA molecule of (a), and,
if the DNA molecule has at least about an 80% sequence identity, preferably at least about an 85% sequence
identity, more preferably at least about a 90% sequence identity, most preferably at least about a 95% sequence
identity to (a) or (b), isolating the test DNA molecule.

In a specific aspect, the invention provides an isolated nucleic acid molecule comprising DNA encoding
a PRO1344 polypeptide, with or without the N-terminal signal sequence and/or the initiating methionine, or is
complementary to such encoding nucleic acid molecule. The signal peptide has been tentatively identified as
extending from about amino acid position 1 to about amino acid position 23 in the sequence of Figure 159 (SEQ
ID NO:231).
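
The distinction between the full-length polypeptide (residues 1 to 720) and the form lacking the signal peptide (residues 24 to 720) can be sketched as a simple split at the tentatively identified boundary. The placeholder sequence below is invented purely so that the lengths match the recited residue numbers; it is not the sequence of Figure 159, and the function name is hypothetical.

```python
from typing import Tuple


def split_signal_peptide(protein: str, signal_end: int) -> Tuple[str, str]:
    """Split a protein into (signal peptide, remainder) at a 1-based, inclusive boundary."""
    if not 0 < signal_end < len(protein):
        raise ValueError("signal_end must fall inside the sequence")
    # Residue numbers are 1-based, Python slices are 0-based and end-exclusive,
    # so residues 1..signal_end are protein[:signal_end].
    return protein[:signal_end], protein[signal_end:]


if __name__ == "__main__":
    # Placeholder 720-residue sequence (assumption for illustration only).
    placeholder = "M" + "A" * 22 + "K" * 697
    signal, mature = split_signal_peptide(placeholder, 23)
    print(len(signal), len(mature))  # residues 1-23 and residues 24-720
```

Because the boundary is described only as tentative ("about amino acid position 23"), the split position is left as a parameter rather than fixed in the code.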

In another aspect, the invention concerns an isolated nucleic acid molecule comprising (a) DNA 
encoding a polypeptide scoring at least about 80% positives, preferably at least about 85% positives, more 
preferably at least about 90% positives, most preferably at least about 95% positives when compared with the 
amino acid sequence of residues 1 or about 24 to about 720, inclusive of Figure 159 (SEQ ID NO:231), or (b) 
the complement of the DNA of (a). 

Another embodiment is directed to fragments of a PRO1344 polypeptide coding sequence that may find
use as hybridization probes. Such nucleic acid fragments are from about 20 to about 80 nucleotides in length,
preferably from about 20 to about 60 nucleotides in length, more preferably from about 20 to about 50
nucleotides in length and most preferably from about 20 to about 40 nucleotides in length and may be derived
from the nucleotide sequence shown in Figure 158 (SEQ ID NO:230).
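
As an illustration of how fragments in the recited length range might be enumerated from a coding sequence, the sketch below slides a fixed-length window along a DNA string and also provides the reverse complement of a fragment. The function names, the step parameter and the toy input are assumptions made for illustration; no probe-selection criteria beyond length are implied by the text above.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper- or lower-case A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def probe_candidates(dna: str, length: int, step: int = 1) -> List[str]:
    """All fragments of the given length, taken every `step` nucleotides.

    The 20-80 nucleotide bounds mirror the broadest range recited above.
    """
    if not 20 <= length <= 80:
        raise ValueError("length must be between 20 and 80 nucleotides")
    if step < 1:
        raise ValueError("step must be at least 1")
    return [dna[i:i + length] for i in range(0, len(dna) - length + 1, step)]


if __name__ == "__main__":
    toy = "ACGT" * 10  # hypothetical 40-nucleotide input, not SEQ ID NO:230
    for fragment in probe_candidates(toy, 20, step=10):
        print(fragment, reverse_complement(fragment))
```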

In another embodiment, the invention provides isolated PRO1344 polypeptide encoded by any of the
isolated nucleic acid sequences hereinabove identified.

In a specific aspect, the invention provides isolated native sequence PRO1344 polypeptide, which in
certain embodiments, includes an amino acid sequence comprising residues 1 or about 24 to about 720 of Figure
159 (SEQ ID NO:231).

In another aspect, the invention concerns an isolated PRO1344 polypeptide, comprising an amino acid
sequence having at least about 80% sequence identity, preferably at least about 85% sequence identity, more
preferably at least about 90% sequence identity, most preferably at least about 95% sequence identity to the
sequence of amino acid residues 1 or about 24 to about 720, inclusive of Figure 159 (SEQ ID NO:231).

In a further aspect, the invention concerns an isolated PRO1344 polypeptide, comprising an amino acid
sequence scoring at least about 80% positives, preferably at least about 85% positives, more preferably at least
about 90% positives, most preferably at least about 95% positives when compared with the amino acid sequence
of residues 1 or about 24 to about 720, inclusive of Figure 159 (SEQ ID NO:231).

In yet another aspect, the invention concerns an isolated PRO1344 polypeptide, comprising the sequence
of amino acid residues 1 or about 24 to about 720, inclusive of Figure 159 (SEQ ID NO:231), or a fragment
thereof sufficient to provide a binding site for an anti-PRO1344 antibody. Preferably, the PRO1344 fragment
retains a qualitative biological activity of a native PRO1344 polypeptide.

In a still further aspect, the invention provides a polypeptide produced by (i) hybridizing a test DNA
molecule under stringent conditions with (a) a DNA molecule encoding a PRO1344 polypeptide having the
sequence of amino acid residues from about 1 or about 24 to about 720, inclusive of Figure 159 (SEQ ID
NO:231), or (b) the complement of the DNA molecule of (a), and if the test DNA molecule has at least about
an 80% sequence identity, preferably at least about an 85% sequence identity, more preferably at least about a
90% sequence identity, most preferably at least about a 95% sequence identity to (a) or (b), (ii) culturing a host
cell comprising the test DNA molecule under conditions suitable for expression of the polypeptide, and (iii)
recovering the polypeptide from the cell culture.

In yet another embodiment, the invention concerns agonists and antagonists of a native PRO1344
polypeptide. In a particular embodiment, the agonist or antagonist is an anti-PRO1344 antibody.

In a further embodiment, the invention concerns a method of identifying agonists or antagonists of a
native PRO1344 polypeptide by contacting the native PRO1344 polypeptide with a candidate molecule and
monitoring a biological activity mediated by said polypeptide.

In a still further embodiment, the invention concerns a composition comprising a PRO1344 polypeptide,
or an agonist or antagonist as hereinabove defined, in combination with a pharmaceutically acceptable carrier.

66. PRO1109 

A cDNA clone (DNA58737-1473) has been identified, having homology to nucleic acid encoding
β-1,4-galactosyltransferase, that encodes a novel polypeptide, designated in the present application as "PRO1109".

In one embodiment, the invention provides an isolated nucleic acid molecule comprising DNA encoding
a PRO1109 polypeptide.

In one aspect, the isolated nucleic acid comprises DNA having at least about 80% sequence identity,
preferably at least about 85% sequence identity, more preferably at least about 90% sequence identity, most
preferably at least about 95% sequence identity to (a) a DNA molecule encoding a PRO1109 polypeptide having
the sequence of amino acid residues from about 1 or about 28 to about 344, inclusive of Figure 161 (SEQ ID
NO:236), or (b) the complement of the DNA molecule of (a).

In another aspect, the invention concerns an isolated nucleic acid molecule encoding a PRO1109
polypeptide comprising DNA hybridizing to the complement of the nucleic acid between about nucleotides 119
or about 200 and about 1150, inclusive, of Figure 160 (SEQ ID NO:235). Preferably, hybridization occurs
under stringent hybridization and wash conditions.

In a further aspect, the invention concerns an isolated nucleic acid molecule comprising DNA having
at least about 80% sequence identity, preferably at least about 85% sequence identity, more preferably at least
about 90% sequence identity, most preferably at least about 95% sequence identity to (a) a DNA molecule
encoding the same mature polypeptide encoded by the human protein cDNA in ATCC Deposit No. 203136
(DNA58737-1473) or (b) the complement of the nucleic acid molecule of (a). In a preferred embodiment, the
nucleic acid comprises a DNA encoding the same mature polypeptide encoded by the human protein cDNA in
ATCC Deposit No. 203136 (DNA58737-1473).

2. Selection and Transformation of Host Cells
Host cells are transfected or transformed with expression or cloning vectors described herein for PRO
production and cultured in conventional nutrient media modified as appropriate for inducing promoters, selecting
transformants, or amplifying the genes encoding the desired sequences. The culture conditions, such as media,
temperature, pH and the like, can be selected by the skilled artisan without undue experimentation. In general,
principles, protocols, and practical techniques for maximizing the productivity of cell cultures can be found in
Mammalian Cell Biotechnology: a Practical Approach, M. Butler, ed. (IRL Press, 1991) and Sambrook et al.,
supra.

Methods of eukaryotic cell transfection and prokaryotic cell transformation are known to the ordinarily
skilled artisan, for example, CaCl2, CaPO4, liposome-mediated and electroporation. Depending on the host cell
used, transformation is performed using standard techniques appropriate to such cells. The calcium treatment
employing calcium chloride, as described in Sambrook et al., supra, or electroporation is generally used for
prokaryotes. Infection with Agrobacterium tumefaciens is used for transformation of certain plant cells, as
described by Shaw et al., Gene, 23:315 (1983) and WO 89/05859 published 29 June 1989. For mammalian cells
without such cell walls, the calcium phosphate precipitation method of Graham and van der Eb, Virology,
52:456-457 (1978) can be employed. General aspects of mammalian cell host system transfections have been
described in U.S. Patent No. 4,399,216. Transformations into yeast are typically carried out according to the
method of Van Solingen et al., J. Bact., 130:946 (1977) and Hsiao et al., Proc. Natl. Acad. Sci. (USA), 76:3829
(1979). However, other methods for introducing DNA into cells, such as by nuclear microinjection,
electroporation, bacterial protoplast fusion with intact cells, or polycations, e.g., polybrene, polyornithine, may
also be used. For various techniques for transforming mammalian cells, see Keown et al., Methods in
Enzymology, 185:527-537 (1990) and Mansour et al., Nature, 336:348-352 (1988).

Suitable host cells for cloning or expressing the DNA in the vectors herein include prokaryote, yeast,
or higher eukaryote cells. Suitable prokaryotes include but are not limited to eubacteria, such as Gram-negative
or Gram-positive organisms, for example, Enterobacteriaceae such as E. coli. Various E. coli strains are
publicly available, such as E. coli K12 strain MM294 (ATCC 31,446); E. coli X1776 (ATCC 31,537); E. coli
strain W3110 (ATCC 27,325) and K5 772 (ATCC 53,635). Other suitable prokaryotic host cells include
Enterobacteriaceae such as Escherichia, e.g., E. coli, Enterobacter, Erwinia, Klebsiella, Proteus, Salmonella,
e.g., Salmonella typhimurium, Serratia, e.g., Serratia marcescens, and Shigella, as well as Bacilli such as B.
subtilis and B. licheniformis (e.g., B. licheniformis 41P disclosed in DD 266,710 published 12 April 1989),
Pseudomonas such as P. aeruginosa, and Streptomyces. These examples are illustrative rather than limiting.
Strain W3110 is one particularly preferred host or parent host because it is a common host strain for recombinant
DNA product fermentations. Preferably, the host cell secretes minimal amounts of proteolytic enzymes. For
example, strain W3110 may be modified to effect a genetic mutation in the genes encoding proteins endogenous
to the host, with examples of such hosts including E. coli W3110 strain 1A2, which has the complete genotype
tonA; E. coli W3110 strain 9E4, which has the complete genotype tonA ptr3; E. coli W3110 strain 27C7
(ATCC 55,244), which has the complete genotype tonA ptr3 phoA E15 (argF-lac)169 degP ompT kanr; E. coli
W3110 strain 37D6, which has the complete genotype tonA ptr3 phoA E15 (argF-lac)169 degP ompT rbs7
ilvG kanr; E. coli W3110 strain 40B4, which is strain 37D6 with a non-kanamycin resistant degP deletion
mutation; and an E. coli strain having mutant periplasmic protease disclosed in U.S. Patent No. 4,946,783 issued
7 August 1990. Alternatively, in vitro methods of cloning, e.g., PCR or other nucleic acid polymerase reactions,
are suitable.

In addition to prokaryotes, eukaryotic microbes such as filamentous fungi or yeast are suitable cloning
or expression hosts for PRO-encoding vectors. Saccharomyces cerevisiae is a commonly used lower eukaryotic
host microorganism. Others include Schizosaccharomyces pombe (Beach and Nurse, Nature, 290:140 [1981];
EP 139,383 published 2 May 1985); Kluyveromyces hosts (U.S. Patent No. 4,943,529; Fleer et al.,
Bio/Technology, 9:968-975 (1991)) such as, e.g., K. lactis (MW98-8C, CBS683, CBS4574; Louvencourt et al.,
J. Bacteriol., 737 [1983]), K. fragilis (ATCC 12,424), K. bulgaricus (ATCC 16,045), K. wickeramii (ATCC
24,178), K. waltii (ATCC 56,500), K. drosophilarum (ATCC 36,906; Van den Berg et al., Bio/Technology,
8:135 (1990)), K. thermotolerans, and K. marxianus; yarrowia (EP 402,226); Pichia pastoris (EP 183,070;
Sreekrishna et al., J. Basic Microbiol., 28:265-278 [1988]); Candida; Trichoderma reesei (EP 244,234);
Neurospora crassa (Case et al., Proc. Natl. Acad. Sci. USA, 76:5259-5263 [1979]); Schwanniomyces such as
Schwanniomyces occidentalis (EP 394,538 published 31 October 1990); and filamentous fungi such as, e.g.,
Neurospora, Penicillium, Tolypocladium (WO 91/00357 published 10 January 1991), and Aspergillus hosts such
as A. nidulans (Ballance et al., Biochem. Biophys. Res. Commun., 112:284-289 [1983]; Tilburn et al., Gene,
26:205-221 [1983]; Yelton et al., Proc. Natl. Acad. Sci. USA, 81:1470-1474 [1984]) and A. niger (Kelly and
Hynes, EMBO J., 4:475-479 [1985]). Methylotrophic yeasts are suitable herein and include, but are not limited
to, yeast capable of growth on methanol selected from the genera consisting of Hansenula, Candida, Kloeckera,
Pichia, Saccharomyces, Torulopsis, and Rhodotorula. A list of specific species that are exemplary of this class
of yeasts may be found in C. Anthony, The Biochemistry of Methylotrophs, 269 (1982).

Suitable host cells for the expression of glycosylated PRO are derived from multicellular organisms.
Examples of invertebrate cells include insect cells such as Drosophila S2 and Spodoptera Sf9, as well as plant
cells. Examples of useful mammalian host cell lines include Chinese hamster ovary (CHO) and COS cells.
More specific examples include monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651);
human embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture, Graham et al., J.
Gen Virol., 36:59 (1977)); Chinese hamster ovary cells/-DHFR (CHO, Urlaub and Chasin, Proc. Natl. Acad.
Sci. USA, 77:4216 (1980)); mouse Sertoli cells (TM4, Mather, Biol. Reprod., 23:243-251 (1980)); human lung
cells (WI38, ATCC CCL 75); human liver cells (Hep G2, HB 8065); and mouse mammary tumor (MMT
060562, ATCC CCL51). The selection of the appropriate host cell is deemed to be within the skill in the art.

3. Selection and Use of a Replicable Vector
The nucleic acid (e.g., cDNA or genomic DNA) encoding PRO may be inserted into a replicable vector
for cloning (amplification of the DNA) or for expression. Various vectors are publicly available. The vector
may, for example, be in the form of a plasmid, cosmid, viral particle, or phage. The appropriate nucleic acid
sequence may be inserted into the vector by a variety of procedures. In general, DNA is inserted into an
appropriate restriction endonuclease site(s) using techniques known in the art. Vector components generally
include, but are not limited to, one or more of a signal sequence, an origin of replication, one or more marker
genes, an enhancer element, a promoter, and a transcription termination sequence. Construction of suitable
vectors containing one or more of these components employs standard ligation techniques which are known to
the skilled artisan.

The PRO may be produced recombinantly not only directly, but also as a fusion polypeptide with a
heterologous polypeptide, which may be a signal sequence or other polypeptide having a specific cleavage site
at the N-terminus of the mature protein or polypeptide. In general, the signal sequence may be a component of
the vector, or it may be a part of the PRO-encoding DNA that is inserted into the vector. The signal sequence
may be a prokaryotic signal sequence selected, for example, from the group of the alkaline phosphatase,
penicillinase, lpp, or heat-stable enterotoxin II leaders. For yeast secretion the signal sequence may be, e.g.,
the yeast invertase leader, alpha factor leader (including Saccharomyces and Kluyveromyces α-factor leaders,
the latter described in U.S. Patent No. 5,010,182), or acid phosphatase leader, the C. albicans glucoamylase
leader (EP 362,179 published 4 April 1990), or the signal described in WO 90/13646 published 15 November
1990. In mammalian cell expression, mammalian signal sequences may be used to direct secretion of the
protein, such as signal sequences from secreted polypeptides of the same or related species, as well as viral
secretory leaders.

Both expression and cloning vectors contain a nucleic acid sequence that enables the vector to replicate
in one or more selected host cells. Such sequences are well known for a variety of bacteria, yeast, and viruses.
The origin of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria, the 2μ plasmid
origin is suitable for yeast, and various viral origins (SV40, polyoma, adenovirus, VSV or BPV) are useful for
cloning vectors in mammalian cells.

Expression and cloning vectors will typically contain a selection gene, also termed a selectable marker. 
Typical selection genes encode proteins that (a) confer resistance to antibiotics or other toxins, e.g., ampicillin, 
neomycin, methotrexate, or tetracycline, (b) complement auxotrophic deficiencies, or (c) supply critical nutrients 
not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli. 

Examples of suitable selectable markers for mammalian cells are those that enable the identification
of cells competent to take up the PRO-encoding nucleic acid, such as DHFR or thymidine kinase. An
appropriate host cell when wild-type DHFR is employed is the CHO cell line deficient in DHFR activity,
prepared and propagated as described by Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216 (1980). A suitable
selection gene for use in yeast is the trp1 gene present in the yeast plasmid YRp7 [Stinchcomb et al., Nature,
282:39 (1979); Kingsman et al., Gene, 7:141 (1979); Tschemper et al., Gene, 10:157 (1980)]. The trp1 gene
provides a selection marker for a mutant strain of yeast lacking the ability to grow in tryptophan, for example,
ATCC No. 44076 or PEP4-1 [Jones, Genetics, 85:12 (1977)].

Expression and cloning vectors usually contain a promoter operably linked to the PRO-encoding nucleic
acid sequence to direct mRNA synthesis. Promoters recognized by a variety of potential host cells are well
known. Promoters suitable for use with prokaryotic hosts include the β-lactamase and lactose promoter systems
[Chang et al., Nature, 275:615 (1978); Goeddel et al., Nature, 281:544 (1979)], alkaline phosphatase, a
tryptophan (trp) promoter system [Goeddel, Nucleic Acids Res., 8:4057 (1980); EP 36,776], and hybrid
promoters such as the tac promoter [deBoer et al., Proc. Natl. Acad. Sci. USA, 80:21-25 (1983)]. Promoters
for use in bacterial systems also will contain a Shine-Dalgarno (S.D.) sequence operably linked to the DNA
encoding PRO.

Examples of suitable promoting sequences for use with yeast hosts include the promoters for
3-phosphoglycerate kinase [Hitzeman et al., J. Biol. Chem., 255:2073 (1980)] or other glycolytic enzymes [Hess
et al., J. Adv. Enzyme Reg., 7:149 (1968); Holland, Biochemistry, 17:4900 (1978)], such as enolase,
glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase,
phosphoglucose isomerase, and glucokinase.

Other yeast promoters, which are inducible promoters having the additional advantage of transcription
controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid
phosphatase, degradative enzymes associated with nitrogen metabolism, metallothionein,
glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization.
Suitable vectors and promoters for use in yeast expression are further described in EP 73,657.

PRO transcription from vectors in mammalian host cells is controlled, for example, by promoters
obtained from the genomes of viruses such as polyoma virus, fowlpox virus (UK 2,211,504 published 5 July
1989), adenovirus (such as Adenovirus 2), bovine papilloma virus, avian sarcoma virus, cytomegalovirus, a
retrovirus, hepatitis-B virus and Simian Virus 40 (SV40), from heterologous mammalian promoters, e.g., the
actin promoter or an immunoglobulin promoter, and from heat-shock promoters, provided such promoters are
compatible with the host cell systems.

Transcription of a DNA encoding the PRO by higher eukaryotes may be increased by inserting an
enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300
bp, that act on a promoter to increase its transcription. Many enhancer sequences are now known from
mammalian genes (globin, elastase, albumin, α-fetoprotein, and insulin). Typically, however, one will use an
enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the replication
origin (bp 100-270), the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the
replication origin, and adenovirus enhancers. The enhancer may be spliced into the vector at a position 5' or
3' to the PRO coding sequence, but is preferably located at a site 5' from the promoter.

Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human, or nucleated
cells from other multicellular organisms) will also contain sequences necessary for the termination of
transcription and for stabilizing the mRNA. Such sequences are commonly available from the 5' and,
occasionally, 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions contain nucleotide
segments transcribed as polyadenylated fragments in the untranslated portion of the mRNA encoding PRO.

Still other methods, vectors, and host cells suitable for adaptation to the synthesis of PRO in
recombinant vertebrate cell culture are described in Gething et al., Nature, 293:620-625 (1981); Mantei et al.,
Nature, 281:40-46 (1979); EP 117,060; and EP 117,058.